Abstract: In two-way OFDM relay, carrier frequency offsets
(CFOs) between the relay and terminal nodes introduce severe
inter-carrier interference (ICI), which degrades the performance
of traditional physical-layer network coding (PLNC). Moreover,
the traditional algorithm for computing the a posteriori
probability in the presence of ICI incurs prohibitive
computational complexity at the relay node. In this paper, we
propose a two-step asynchronous PLNC scheme at the relay to
mitigate the effect of CFOs. In the first step, we reconstruct
the ICI component, using the space-alternating generalized
expectation-maximization (SAGE) algorithm to jointly estimate
the needed parameters. In the second step, a channel-decoding
and network-coding scheme uses the reconstructed ICI to transform
the received signal into the XOR of the two terminals' transmitted
information. It is shown that the proposed scheme greatly
mitigates the impact of CFOs with relatively low computational
complexity in two-way OFDM relay.

I. Introduction 

There is increasing interest in applying network coding [1]
to wireless communication to improve system throughput [2]-[5].
The simplest scenario in which network coding can be applied is
the two-way relay channel (TWRC), as illustrated in Fig. 1. In a
TWRC, two terminal nodes $T_1$ and $T_2$ exchange statistically
independent information with the help of a relay node $R$.
Traditionally, this exchange takes four time slots, that is,
$T_1 \to R$, $R \to T_2$, $T_2 \to R$ and $R \to T_1$, as
illustrated in Fig. 1(a). To enhance the throughput of the TWRC,
physical-layer network coding (PLNC) was introduced in [6].
Compared with the traditional protocol, PLNC reduces the number
of time slots required for one round of information exchange from
four to two, as shown in Fig. 1(b).

In this paper, we consider the OFDM-modulated TWRC, or two-way
OFDM relay (TWOR). A key issue in the practical application of
PLNC in TWOR is how to deal with the frequency asynchrony between
the signals transmitted by the two terminal nodes. That is,
symbols transmitted by different terminals may arrive at the relay
node with different CFOs. Because of the CFOs, the traditional
channel-decoding and network-coding mapping method in [6] suffers
from severe performance degradation. Moreover, the traditional
algorithm [7] for computing the a posteriori probability at the
relay node may be prohibitively expensive for practical
implementation because ICI correlates the received samples. In
addition, OFDM-modulated PLNC assigns the same subcarriers to both
terminals, which is very different from OFDMA, where the
subcarriers of different users are orthogonal. That is to say, the
received signal on each subcarrier is the superposition of the
symbols transmitted by $T_1$ and $T_2$. Consequently, traditional
CFO compensation methods developed for OFDMA [8], [9] are
difficult to apply in the PLNC system. In [10], Lu investigates
frequency-asynchronous PLNC for OFDM systems and proposes
compensating the CFOs with the mean of the two terminals'
estimated CFOs at the relay node. Unfortunately, this scheme does
not perform well when the relative CFO between the two terminals
becomes large.

Fig. 1: A system model for two-way relay channel: (a)
Traditional four-slot protocol; (b) PLNC.

In this paper, we develop a two-step asynchronous PLNC
scheme at the relay node. Compared with previous work:
1) the proposed method effectively mitigates the effect of
frequency offsets in the TWOR system; 2) it copes with larger
relative CFOs without incurring severe performance degradation
with respect to a perfectly synchronized system; 3) it has
relatively low computational complexity.

Notation: Lower- and upper-case bold symbols denote column
vectors and matrices, respectively. $(\cdot)^T$, $(\cdot)^*$ and
$(\cdot)^H$ denote transpose, complex conjugate and Hermitian
transpose, respectively. $E(\cdot)$ stands for the expectation
operation and $\mathrm{diag}(\cdot)$ denotes a diagonal matrix.
A hat denotes an estimate, e.g. $\hat{\theta}$ is the estimate of
$\theta$. $|\cdot|$ and $\|\cdot\|$ denote the magnitude and the
Euclidean norm, respectively. We use $I_N$ and $0_{N \times M}$
for the $N \times N$ identity matrix and the $N \times M$ matrix
with all zero entries, respectively.

II. System Model 

We consider the TWOR network shown in Fig. 1, where $T_1$ and
$T_2$ exchange statistically independent information with the
help of node $R$. It is assumed that all nodes are half-duplex,
that is, a node cannot transmit and receive simultaneously. It is
also assumed that each node is equipped with a single antenna and
that no direct link exists between $T_1$ and $T_2$.

We consider the two-phase transmission scheme, which consists
of a multiple-access (MAC) phase and a broadcast (BC) phase, as
illustrated in Fig. 1(b). During the MAC phase, terminals $T_1$
and $T_2$ send OFDM-modulated signals to the relay node $R$
simultaneously. Let $b_i$, $c_i$ and $u_i$ denote the uncoded
source vector, the channel-coded vector and the modulated vector
of terminal $T_i$, respectively. Let $N$ denote the total number
of subcarriers. Let $x_i(n)$ denote the $n$th frequency-domain
OFDM block of node $T_i$, where
$x_i(n) = [x_{i,0}(n), x_{i,1}(n), \cdots, x_{i,N-1}(n)]^T$,
$i \in \{1, 2\}$. We define $A_i$ as the subcarrier allocation
matrix,

$$[A_i]_{q,k} = \begin{cases} 1 & \text{if the $q$th subcarrier is allocated to the $k$th element of } u_i(n) \\ 0 & \text{if the $q$th subcarrier is not allocated to } T_i \end{cases} \tag{1}$$

Then we have $x_i(n) = A_i u_i(n)$. Notably, during the MAC
phase, $T_1$ and $T_2$ are allocated the same subset of $K$
subcarriers because of PLNC, so $A_1 = A_2 = A$. Let
$h_i = [h_i(0), h_i(1), \cdots, h_i(L_i - 1), 0^T_{(N-L_i) \times 1}]^T$
denote the channel impulse response (CIR) between $T_i$ and the
relay node. Here we assume that the length of the cyclic prefix
(CP) satisfies $N_g > \max\{L_1, L_2\}$ to avoid inter-block
interference (IBI). Therefore, we concentrate only on the $n$th
OFDM block and omit the index $n$ in the rest of this work. Then
the received signal samples at node $R$ at the end of the MAC
phase can be expressed as

$$y_R = E(\varepsilon_1) F X_1 D h_1 + E(\varepsilon_2) F X_2 D h_2 + w, \tag{2}$$

in which

• $E(\varepsilon_i) = \mathrm{diag}\{1, e^{j2\pi\varepsilon_i/N}, \cdots, e^{j2\pi(N-1)\varepsilon_i/N}\}$ and $\varepsilon_i$ is the
normalized CFO for node $T_i$;

• $F$ is an $N \times N$ matrix with elements
$[F]_{p,q} = \frac{1}{\sqrt{N}} e^{j2\pi pq/N}$ for $0 \le p, q \le N-1$;

• $X_i = \mathrm{diag}\{x_{i,0}, x_{i,1}, \cdots, x_{i,N-1}\}$ is a diagonal matrix;

• $D$ is a Fourier matrix with elements $[D]_{p,q} = e^{-j2\pi pq/N}$
for $0 \le p, q \le N-1$;

• $w = [w(0), w(1), \cdots, w(N-1)]^T$ is the additive white
Gaussian noise vector with zero mean and covariance matrix
$\sigma_w^2 I_N$.
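To make the signal model in (2) concrete, the following minimal Python sketch builds the time-domain samples received from the two terminals. It is an illustration under stated assumptions rather than the authors' simulator: $F$ is taken to be unitary (scaled by $1/\sqrt{N}$), all subcarriers are allocated ($A = I$), and the helper names (`apply_F`, `apply_FH`, `apply_D`, `apply_E`, `received_samples`) as well as the toy symbols and channel taps are hypothetical.

```python
import cmath
import math
import random

def apply_F(v):
    """Multiply by F, assumed unitary: [F]_{p,q} = exp(j*2*pi*p*q/N) / sqrt(N)."""
    n = len(v)
    return [sum(v[q] * cmath.exp(2j * math.pi * p * q / n) for q in range(n)) / math.sqrt(n)
            for p in range(n)]

def apply_FH(v):
    """Multiply by F^H, which maps time-domain samples back to subcarriers."""
    n = len(v)
    return [sum(v[p] * cmath.exp(-2j * math.pi * p * q / n) for p in range(n)) / math.sqrt(n)
            for q in range(n)]

def apply_D(h, n):
    """Multiply the zero-padded CIR by D: [D]_{p,q} = exp(-j*2*pi*p*q/N)."""
    return [sum(h[q] * cmath.exp(-2j * math.pi * p * q / n) for q in range(len(h)))
            for p in range(n)]

def apply_E(v, eps):
    """Multiply by E(eps) = diag{exp(j*2*pi*n*eps/N)}, n = 0..N-1."""
    n = len(v)
    return [v[k] * cmath.exp(2j * math.pi * k * eps / n) for k in range(n)]

def received_samples(x, h, eps, sigma_w=0.0, seed=0):
    """y_R = sum_i E(eps_i) F X_i D h_i + w over the two terminals i = 1, 2."""
    rng = random.Random(seed)
    n = len(x[0])
    y = [0j] * n
    for x_i, h_i, eps_i in zip(x, h, eps):
        s_i = [xk * hk for xk, hk in zip(x_i, apply_D(h_i, n))]  # S_i = X_i D h_i
        y = [a + b for a, b in zip(y, apply_E(apply_F(s_i), eps_i))]
    std = sigma_w / math.sqrt(2)  # per real dimension, so that E|w(n)|^2 = sigma_w^2
    return [a + complex(rng.gauss(0, std), rng.gauss(0, std)) for a in y]

x = [[1, -1, 1, 1], [-1, -1, 1, -1]]   # toy BPSK symbols of T_1 and T_2
h = [[1.0, 0.5], [0.8, -0.3j]]         # toy two-tap CIRs
y_sync = received_samples(x, h, eps=[0.0, 0.0])
y_async = received_samples(x, h, eps=[0.1, -0.1])
```

With zero CFOs and no noise, applying `apply_FH` to `y_sync` returns the superposition $X_1 D h_1 + X_2 D h_2$ on each subcarrier; with non-zero CFOs the subcarriers no longer separate cleanly.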

By multiplying both sides of (2) by the matrix $F^H$, we obtain
the frequency-domain received samples, which can be expressed as

$$\begin{aligned} Y_R &= A^T (\Pi_1 S_1 + \Pi_2 S_2) + W \\ &= \underbrace{A^T (\Lambda_1 S_1 + \Lambda_2 S_2)}_{\text{Desired signal}} + \underbrace{A^T \big( (\Pi_1 - \Lambda_1) S_1 + (\Pi_2 - \Lambda_2) S_2 \big)}_{\text{ICI}} + W, \end{aligned} \tag{3}$$

in which $\Pi_i = F^H E(\varepsilon_i) F$ is defined as the
interference matrix, $\Lambda_i = \mathrm{diag}(\Pi_i)$ and
$S_i = X_i D h_i$. Obviously, $\Lambda_i = I_N$ and
$(\Pi_i - \Lambda_i) = 0_{N \times N}$ in the synchronous case.
From (3), we can see that, with non-zero CFOs, each output symbol
is affected by ICI from all other subcarriers because
orthogonality among the subcarriers is lost. This results in poor
performance for the traditional channel-decoding and
network-coding mapping method [6].
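The split of (3) into a desired part and an ICI part can be checked numerically. The sketch below is a hedged illustration: it assumes a unitary $F$, so that the $(p, q)$ entry of $F^H E(\varepsilon) F$ has the closed form used in `interference_matrix`, and the function names and the toy vector `s` are hypothetical.

```python
import cmath
import math

def interference_matrix(eps, n):
    """Pi = F^H E(eps) F for a unitary F; entry (p, q) equals
    (1/N) * sum_m exp(j*2*pi*m*(q - p + eps)/N)."""
    return [[sum(cmath.exp(2j * math.pi * m * (q - p + eps) / n) for m in range(n)) / n
             for q in range(n)] for p in range(n)]

def split_desired_ici(pi, s):
    """Return (Lambda s, (Pi - Lambda) s) with Lambda the diagonal part of Pi."""
    n = len(s)
    desired = [pi[p][p] * s[p] for p in range(n)]
    ici = [sum(pi[p][q] * s[q] for q in range(n) if q != p) for p in range(n)]
    return desired, ici

def power(v):
    """Total energy of a complex vector."""
    return sum(abs(a) ** 2 for a in v)

s = [1, -1, 1, 1, -1, 1, -1, -1]  # toy S_i = X_i D h_i with a flat channel
for eps in (0.0, 0.05, 0.2):
    desired, ici = split_desired_ici(interference_matrix(eps, len(s)), s)
    print(eps, power(ici) / power(desired))  # ICI-to-desired power ratio
```

For `eps = 0.0` the interference matrix reduces to the identity and the ICI term vanishes, matching the synchronous case described above.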

Using the proposed mapping scheme for asynchronous PLNC
detailed in the next section, relay node $R$ transforms the
received superimposed signal in the presence of ICI into the
XORed messages $b_1 \oplus b_2 = \mathcal{T}(Y_R)$. After that,
in the BC phase, the relay broadcasts $b_1 \oplus b_2$. Both
$T_1$ and $T_2$ try to decode $b_1 \oplus b_2$ from their
corresponding received signals. Since $T_1$ ($T_2$) knows its own
bits, after decoding $b_1 \oplus b_2$ it can extract the bits
transmitted by $T_2$ ($T_1$) from the XORed messages by removing
(XORing out) its own information.
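The last step of the exchange relies only on the fact that XOR is its own inverse. A minimal sketch with made-up bit vectors (the names and values are illustrative only):

```python
def xor_bits(a, b):
    """Bitwise XOR of two equal-length bit lists."""
    return [p ^ q for p, q in zip(a, b)]

b1 = [1, 0, 1, 1, 0]  # bits of T_1 (toy values)
b2 = [0, 0, 1, 0, 1]  # bits of T_2 (toy values)

relay_broadcast = xor_bits(b1, b2)               # what R decodes and broadcasts
recovered_at_t1 = xor_bits(relay_broadcast, b1)  # T_1 removes its own bits
recovered_at_t2 = xor_bits(relay_broadcast, b2)  # T_2 removes its own bits
```

Each terminal recovers the other terminal's bits without the relay ever decoding $b_1$ or $b_2$ individually.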

III. Proposed Scheme 

For frequency-asynchronous PLNC, a critical challenge is how
to map the received signal at the relay node into the XOR of the
two terminals' transmitted information. In this section, we
present a two-step asynchronous PLNC scheme to deal with this
problem. In the first step, we reconstruct the ICI component in
(3), employing the SAGE algorithm to jointly estimate$^1$
$\varepsilon = [\varepsilon_1, \varepsilon_2]^T$,
$h = [h_1^T, h_2^T]^T$ and $X = [X_1, X_2]$. Here we suppose that
a coarse CFO compensation has been performed before the uplink
frame; as a result, we need to concentrate only on the situation
in which the CFOs are less than half of the subcarrier spacing,
i.e., $-1/2 < \varepsilon_i < 1/2$ for $i = 1, 2$. In the second
step, using the reconstructed ICI, a channel-decoding and
network-coding scheme for asynchronous PLNC maps the received
signal into the XOR of the two terminals' transmitted
information.



A. SAGE Based ICI Reconstruction 

Let $\theta = [\varepsilon^T, h^T, X]^T$ denote the set of
parameters to be estimated from the observed data $y_R$ with
conditional probability density function $p(y_R|\theta)$.
Obviously, maximizing $p(y_R|\theta)$ with respect to the unknown
parameters $\theta$ is equivalent to maximizing the
log-likelihood function, which is given by

$$L(\theta) = -\frac{1}{\sigma_w^2} \left\| y_R - E(\varepsilon_1) F X_1 D h_1 - E(\varepsilon_2) F X_2 D h_2 \right\|^2 + \text{const}. \tag{4}$$

We therefore consider parameter estimation from the viewpoint of
maximizing $L(\theta)$, that is,

$$\hat{\theta} = \arg\max_{\theta} L(\theta). \tag{5}$$



However, directly solving this maximization problem would
require an exhaustive search over the multi-dimensional space
spanned by $\varepsilon = [\varepsilon_1, \varepsilon_2]^T$,
$h = [h_1^T, h_2^T]^T$ and $X = [X_1, X_2]$, which may be
prohibitively expensive for practical implementation. To reduce
the computational complexity, we propose a SAGE-based scheme that
estimates the multi-dimensional parameters iteratively.

$^1$ The reason why we update the CFOs during the payload is
twofold: i) it is necessary to estimate the residual CFO caused
by the estimation error in the preamble; ii) for scenarios with
time-varying CFOs, reconstructing the ICI with estimates from the
preamble may result in poor performance.

To apply the SAGE algorithm to asynchronous PLNC, we divide
the parameters to be estimated into two groups
$\theta_i = [\varepsilon_i, h_i^T, X_i]^T$, for $i = 1, 2$. A
hidden space [11] must be chosen for each group so that one group
can be updated while the other is kept fixed at its latest value.
Here we define the hidden space as

$$y_i = E(\varepsilon_i) F X_i D h_i + w. \tag{6}$$

In (6), we assign all the noise to the hidden space of
$\theta_i$. It has been shown in [11] that such a choice is
optimal for reducing the Fisher information and increasing the
convergence rate.

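The update of $\theta_i$, $\forall i \in \{1, 2\}$, at the
$(m+1)$th iteration can be described as follows:

1) Expectation-step: In this step, we define the conditional
log-likelihood function [11], or Q function, of $\theta_i$, that is,

$$Q(\theta_i | \hat{\theta}^{[m]}) = E\left\{ \log p(y_i | \theta_i, \hat{\theta}_{\bar{i}}^{[m]}) \,\middle|\, y_R, \hat{\theta}^{[m]} \right\}, \tag{7}$$

in which $p(y_i | \theta_i, \hat{\theta}_{\bar{i}}^{[m]})$ is the
conditional probability density function of $y_i$,

$$p(y_i | \theta_i, \hat{\theta}_{\bar{i}}^{[m]}) = \frac{1}{(\pi \sigma_w^2)^N} \exp\left\{ -\frac{1}{\sigma_w^2} \left\| y_i - E(\varepsilon_i) F X_i D h_i \right\|^2 \right\}, \tag{8}$$

where $\bar{i} = \{1, 2\} \setminus \{i\}$. Substituting (8) into
(7) and removing the terms that do not depend on $\theta_i$, we
can rewrite (7) as

$$Q(\theta_i | \hat{\theta}^{[m]}) = \mathrm{Re}\left\{ (\hat{y}_i^{[m]})^H E(\varepsilon_i) F X_i D h_i \right\}, \tag{9}$$

in which $\hat{y}_i^{[m]}$ is the estimate of $y_i$ at the
$(m+1)$th iteration and

$$\hat{y}_i^{[m]} = y_R - E(\hat{\varepsilon}_{\bar{i}}^{[m]}) F \hat{X}_{\bar{i}}^{[m]} D \hat{h}_{\bar{i}}^{[m]}. \tag{10}$$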



2) Maximization-step: In this step, we update the values of
$\varepsilon_i$, $h_i$ and $X_i$ sequentially. The channel
estimate at the $(m+1)$th iteration is obtained by maximizing (9)
with respect to $h_i$ while fixing $\varepsilon_i$ and $X_i$ at
their latest estimates, i.e.,

$$\hat{h}_i^{[m+1]} = \arg\max_{h_i} \mathrm{Re}\left\{ (\hat{y}_i^{[m]})^H E(\hat{\varepsilon}_i^{[m]}) F \hat{X}_i^{[m]} D h_i \right\} = K_i D^H (\hat{X}_i^{[m]})^H F^H E^H(\hat{\varepsilon}_i^{[m]}) \hat{y}_i^{[m]}, \tag{11}$$

in which

$$K_i = \left( \sigma_w^2 R_i^{-1} + D^H (\hat{X}_i^{[m]})^H \hat{X}_i^{[m]} D \right)^{-1} \tag{12}$$

and $R_i$ is the covariance matrix of $h_i$,
$R_i = E\{h_i h_i^H\}$. It is seen that (11) is equivalent to the
MMSE estimate [12] obtained with the latest estimates of
$\varepsilon_i$ and $X_i$.

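The frequency offset estimate at the $(m+1)$th iteration is
obtained by maximizing (9) with respect to $\varepsilon_i$ while
keeping $h_i$ and $X_i$ fixed at their latest values, i.e.,

$$\hat{\varepsilon}_i^{[m+1]} = \arg\max_{\varepsilon_i} \mathrm{Re}\left\{ (\hat{y}_i^{[m]})^H E(\varepsilon_i) F \hat{X}_i^{[m]} D \hat{h}_i^{[m+1]} \right\}. \tag{13}$$

To cope with the nonlinearity in (13), we assume that
$\varepsilon_i - \hat{\varepsilon}_i^{[m]}$ is sufficiently small
that we can replace $e^{j2\pi n \varepsilon_i / N}$ with its
Taylor series expansion around $\hat{\varepsilon}_i^{[m]}$ up to
the second-order term, i.e.,

$$e^{j2\pi n \varepsilon_i / N} \approx e^{j2\pi n \hat{\varepsilon}_i^{[m]} / N} \left[ 1 + j \frac{2\pi n}{N} \Delta\varepsilon_i^{[m]} - \frac{1}{2} \left( \frac{2\pi n}{N} \right)^2 \left( \Delta\varepsilon_i^{[m]} \right)^2 \right], \tag{14}$$

in which $\Delta\varepsilon_i^{[m]} = \varepsilon_i - \hat{\varepsilon}_i^{[m]}$.
Substituting (14) into (13) and removing the terms that do not
depend on $\varepsilon_i$, we have

$$\hat{\varepsilon}_i^{[m+1]} = \hat{\varepsilon}_i^{[m]} - \frac{N \sum_{n=0}^{N-1} n \, \mathrm{Im}\left\{ (\hat{y}_i^{[m]}(n))^* \eta_i^{[m]}(n) e^{j2\pi n \hat{\varepsilon}_i^{[m]} / N} \right\}}{2\pi \sum_{n=0}^{N-1} n^2 \, \mathrm{Re}\left\{ (\hat{y}_i^{[m]}(n))^* \eta_i^{[m]}(n) e^{j2\pi n \hat{\varepsilon}_i^{[m]} / N} \right\}}, \tag{15}$$

in which we let $\eta_i^{[m]} = F \hat{X}_i^{[m]} D \hat{h}_i^{[m+1]}$.
In order to update the value of $X_i$, we replace equation (10)
with

$$\hat{y}_i^{[m]} = E(\varepsilon_i) F X_i D h_i + I_{\bar{i}}^{[m]} + w, \tag{16}$$

in which
$I_{\bar{i}}^{[m]} = E(\varepsilon_{\bar{i}}) F X_{\bar{i}} D h_{\bar{i}} - E(\hat{\varepsilon}_{\bar{i}}^{[m]}) F \hat{X}_{\bar{i}}^{[m]} D \hat{h}_{\bar{i}}^{[m]}$
is the residual interference from $T_{\bar{i}}$ after the $m$th
iteration. Note that $I_{\bar{i}}^{[m]}(k)$ is a linear function
of all symbols transmitted by $T_{\bar{i}}$. Therefore, it is
reasonable to assume that $I_{\bar{i}}^{[m]}$ is nearly Gaussian
distributed with zero mean and covariance matrix
$\sigma_I^2 I_N$, following the central limit theorem. Then we
obtain

$$\hat{X}_i^{[m+1]}(k, k) = \arg\min_{X_i(k,k)} \left| Y_i(k) - X_i(k, k) H_i(k) \right|^2, \tag{17}$$

in which $Y_i = A^T F^H E^H(\hat{\varepsilon}_i^{[m+1]}) \hat{y}_i^{[m]}$
and $H_i = D \hat{h}_i^{[m+1]}$.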
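The CFO refinement in (15) and the per-subcarrier symbol decision in (17) can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' implementation: the sign convention of the CFO step is the one obtained by maximizing the second-order expansion in (14), the symbol rule is specialized to BPSK, and the function names and toy data are hypothetical.

```python
import cmath
import math

def cfo_update(y_hat, eta, eps_prev):
    """One Newton-type step of the form (15).
    y_hat: samples attributed to terminal i; eta: F X_i D h_i at the latest estimates."""
    n_sc = len(y_hat)
    num = 0.0
    den = 0.0
    for n in range(n_sc):
        z = y_hat[n].conjugate() * eta[n] * cmath.exp(2j * math.pi * n * eps_prev / n_sc)
        num += n * z.imag
        den += n * n * z.real
    return eps_prev - n_sc * num / (2 * math.pi * den)

def bpsk_decision(y_k, h_k):
    """Rule (17) for BPSK: pick x in {+1, -1} minimizing |y_k - x * h_k|^2."""
    return min((1, -1), key=lambda x: abs(y_k - x * h_k) ** 2)

# Toy check of the CFO step: noise-free samples with a true offset of 0.1.
n_sc, eps_true = 16, 0.1
eta = [complex(math.cos(0.7 * n), math.sin(1.3 * n)) for n in range(n_sc)]
y_hat = [eta[n] * cmath.exp(2j * math.pi * n * eps_true / n_sc) for n in range(n_sc)]

eps_est = 0.0  # same initial value as used in the simulations
for _ in range(5):
    eps_est = cfo_update(y_hat, eta, eps_est)
```

In this noise-free toy case the estimate settles at the true offset within a few steps; with noise and residual interference the step is only approximate, which is why the scheme iterates.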
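Let $\hat{\theta} = [\hat{\varepsilon}^T, \hat{h}^T, \hat{X}]^T$
denote the final estimate of $\theta$ after $M$ iterations. Then
the ICI component in (3) can be reconstructed by

$$\hat{I}_R = (\hat{\Pi}_1 - \hat{\Lambda}_1) \hat{S}_1 + (\hat{\Pi}_2 - \hat{\Lambda}_2) \hat{S}_2, \tag{18}$$

in which $\hat{\Pi}_i = F^H E(\hat{\varepsilon}_i) F$,
$\hat{\Lambda}_i = \mathrm{diag}(\hat{\Pi}_i)$ and
$\hat{S}_i = \hat{X}_i D \hat{h}_i$.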



B. Channel-Decoding and Network-Coding Scheme 



In this subsection, we investigate the channel-decoding and
network-coding scheme for frequency-asynchronous PLNC. Notably,
for synchronous PLNC, the channel decoding and network coding at
the relay node consist of the following two steps [7].

Step-1: In the first step, the relay maps the received samples
$Y_R$ into the XORed messages $b_1 \oplus b_2$ by the function
$\mathcal{T}$. Specifically, the relay first computes the a
posteriori probability $p(u_1(k), u_2(k) | Y_R(k))$ from the
received samples. Then the log-likelihood ratios (LLRs) of the
network-coded information are obtained by

$$\Phi(c_1(l) \oplus c_2(l)) = \log \left( \frac{\sum_{(u_1(k), u_2(k)) :\, c_1(l) \oplus c_2(l) = 1} p(u_1(k), u_2(k) | Y_R(k))}{\sum_{(u_1(k), u_2(k)) :\, c_1(l) \oplus c_2(l) = 0} p(u_1(k), u_2(k) | Y_R(k))} \right). \tag{19}$$

Because both source nodes use the same linear channel code, the
XOR of two codewords $c_1 \oplus c_2$ is also a valid codeword.
Therefore, the relay can directly perform channel decoding on
$\Phi(c_1(l) \oplus c_2(l))$ to obtain $b_1 \oplus b_2$.

Step-2: In the second step, the relay re-encodes
$b_1 \oplus b_2$ with the channel code and broadcasts the coded
information in the BC phase.
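For BPSK, the LLR in (19) reduces to a ratio of two sums over the four symbol pairs. The sketch below assumes the mapping "coded bit 0 to +1, coded bit 1 to -1", which the paper does not state explicitly; the function name and the posterior values are made up for illustration.

```python
import math

def xor_llr(post):
    """LLR of c1 XOR c2 as in (19); post maps (u1, u2) to a posterior probability.
    Assumed BPSK mapping: coded bit 0 -> +1, coded bit 1 -> -1."""
    p_one = post[(1, -1)] + post[(-1, 1)]    # symbol pairs with c1 XOR c2 = 1
    p_zero = post[(1, 1)] + post[(-1, -1)]   # symbol pairs with c1 XOR c2 = 0
    return math.log(p_one / p_zero)

# Hypothetical posterior for one subcarrier.
post = {(1, 1): 0.05, (1, -1): 0.70, (-1, 1): 0.15, (-1, -1): 0.10}
llr = xor_llr(post)  # positive here, so the XOR bit is more likely 1
```

The relay never needs to resolve which of the two pairs with the same XOR was sent, which is what allows it to feed these LLRs directly to a single channel decoder.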

For the asynchronous case, we need to compute the a posteriori
probability $p(u_1(k), u_2(k) | Y_R, \varepsilon, h)$ in order to
apply (19). However, the major difficulty here is that computing
this probability takes an exhaustive search over a
$2(N-1)$-dimensional space, even for BPSK modulation with perfect
knowledge of the CIRs and CFOs. This is caused by correlations in
the received samples. That is, because orthogonality among the
subcarriers is lost, each received symbol is affected by
interference from all other subcarriers, as shown in (3).
Consequently, each sample is correlated with all other samples.

To circumvent this obstacle, we present a three-step process
to perform the channel-decoding and network-coding mapping at
node $R$.

Step-1: The first step is referred to as the interference
cancellation step. In this step, we remove the ICI component in
(3) using the reconstructed ICI given in (18). By removing the
ICI component from the frequency-domain received samples, we
obtain

$$Y'_R = A^T (\Lambda_1 S_1 + \Lambda_2 S_2) + A^T (I_R - \hat{I}_R) + W, \tag{20}$$

in which $I_R = (\Pi_1 - \Lambda_1) S_1 + (\Pi_2 - \Lambda_2) S_2$.
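The interference cancellation step can be illustrated by rebuilding the ICI term from parameter estimates and subtracting it, as in (18) and (20). This minimal sketch assumes a unitary $F$, all subcarriers allocated ($A = I$), no noise, and perfect estimates; the function names and toy vectors are hypothetical.

```python
import cmath
import math

def interference_matrix(eps, n):
    """Pi = F^H E(eps) F for a unitary F (assumed normalization)."""
    return [[sum(cmath.exp(2j * math.pi * m * (q - p + eps) / n) for m in range(n)) / n
             for q in range(n)] for p in range(n)]

def apply_split(eps, s):
    """Return (Lambda s, (Pi - Lambda) s): the desired part and the ICI part."""
    n = len(s)
    pi = interference_matrix(eps, n)
    desired = [pi[p][p] * s[p] for p in range(n)]
    ici = [sum(pi[p][q] * s[q] for q in range(n) if q != p) for p in range(n)]
    return desired, ici

def cancel_ici(y_freq, eps_hat, s_hat):
    """Y'_R = Y_R - I_hat_R, with I_hat_R rebuilt from the estimates as in (18)."""
    out = list(y_freq)
    for eps_i, s_i in zip(eps_hat, s_hat):
        _, ici_i = apply_split(eps_i, s_i)
        out = [a - b for a, b in zip(out, ici_i)]
    return out

# Toy example with perfect knowledge of the CFOs and of S_1, S_2.
eps = [0.1, -0.1]
s = [[1, -1, 1, 1, -1, 1, -1, -1], [-1, -1, 1, -1, 1, 1, -1, 1]]
parts = [apply_split(e, v) for e, v in zip(eps, s)]
desired_sum = [d1 + d2 for d1, d2 in zip(parts[0][0], parts[1][0])]
y_freq = [d + i1 + i2 for d, i1, i2 in zip(desired_sum, parts[0][1], parts[1][1])]
y_clean = cancel_ici(y_freq, eps, s)
```

With perfect estimates `y_clean` equals the desired-signal term exactly; with imperfect estimates the residual $I_R - \hat{I}_R$ remains and is treated as small in the next step.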

Step-2: From (18), it is seen that $\hat{I}_R$ is the estimate
of $I_R$. Here we suppose that the elements of
$I_R - \hat{I}_R$, i.e., the estimation errors, are sufficiently
small that we can compute
$p(u_1(k), u_2(k) | Y_R, \varepsilon, h)$ approximately by

$$p(u_1(k) = a, u_2(k) = b | Y_R, \varepsilon, h) \approx C \exp\left\{ -\frac{1}{\sigma_w^2} \left| Y'_R(k) - a \Gamma_1(k) - b \Gamma_2(k) \right|^2 \right\}, \tag{21}$$

in which $\Gamma_i = A^T \Lambda_i D h_i$ and $C$ is a constant
independent of $u_1$ and $u_2$. Then the LLRs of the
network-coded codewords can be computed by (19). After that, the
channel decoder is employed to map the LLRs into
$b_1 \oplus b_2$.

Fig. 2: BER performance versus number of SAGE iterations for
different normalized CFOs: (a) SNR=10dB; (b) SNR=15dB.

Step-3: This step is identical to Step-2 for synchronous
PLNC.
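Combining the approximate posterior in (21) with the LLR in (19) gives a per-subcarrier computation that is easy to sketch for BPSK. The code below is a hedged illustration: it assumes the exponent in (21) is scaled by $1/\sigma_w^2$ and the mapping "coded bit 0 to +1, coded bit 1 to -1", and the function name and numerical values are hypothetical.

```python
import math

def xor_llr_from_sample(y_k, g1_k, g2_k, sigma2):
    """LLR of c1 XOR c2 on one subcarrier for BPSK, from (21) and (19).
    y_k: ICI-cancelled sample Y'_R(k); g1_k, g2_k: Gamma_1(k), Gamma_2(k)."""
    def log_metric(a, b):
        return -abs(y_k - a * g1_k - b * g2_k) ** 2 / sigma2

    def log_sum(terms):
        top = max(terms)  # log-sum-exp for numerical stability
        return top + math.log(sum(math.exp(t - top) for t in terms))

    xor_one = log_sum([log_metric(1, -1), log_metric(-1, 1)])
    xor_zero = log_sum([log_metric(1, 1), log_metric(-1, -1)])
    return xor_one - xor_zero  # the constant C cancels in the ratio

g1, g2 = 1 + 0.2j, 0.6 - 0.5j  # toy effective channel gains
llr_diff = xor_llr_from_sample(g1 - g2, g1, g2, sigma2=0.1)     # symbols (+1, -1)
llr_same = xor_llr_from_sample(-(g1 + g2), g1, g2, sigma2=0.1)  # symbols (-1, -1)
```

Working in the log domain avoids underflow when the metrics are large and negative, which happens at high SNR.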

C. Complexity Analysis 

In this subsection, we study the computational complexity of
the proposed scheme. Note that multiplications by the matrices
$F$ and $D$ are equivalent to DFT (IDFT) operations, which can be
computed efficiently by the FFT with $N \log_2 N$ complex
additions and $(N/2) \log_2 N$ complex multiplications,
respectively. Multiplications by the matrices
$E(\hat{\varepsilon}_i)$ and $\hat{X}_i$ require $N$ and $K$
complex multiplications, respectively. Therefore, the total
computational complexity of each SAGE iteration (per parameter
group $\theta_i$) is $6N \log_2 N + 2N + K$ complex additions and
$3N \log_2 N + 5N + K$ complex multiplications. Also, the
computational load for (18)-(20) is $6N \log_2 N + 6K$ complex
additions and $3N \log_2 N + 28K$ complex multiplications.
According to the analysis above, the overall complexity of the
proposed two-step asynchronous PLNC scheme is approximately
$(12M + 6) N \log_2 N + 4MN + (2M + 6) K$ complex additions and
$(6M + 3) N \log_2 N + 10MN + (2M + 28) K$ complex
multiplications, in which $M$ is the maximum number of SAGE
iterations.

Fig. 3: BER performance versus normalized CFO, SNR is set
to 15dB.

Notably, for scenarios in which the CFOs are nearly constant,
(15) can be computed only in the first block after the preamble
to further reduce the computational complexity.
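The operation counts above can be tabulated with a short script. This is a bookkeeping sketch, not a profiler: it assumes $N$ is a power of two, that the per-iteration count applies once per terminal (so $2M$ times in total), and that the addition count for (18)-(20) has a $6K$ term, which is the value implied by the stated overall total. The function names are hypothetical.

```python
def log2_int(n):
    """log2(N) for a power-of-two FFT size."""
    if n < 1 or n & (n - 1):
        raise ValueError("N is assumed to be a power of two")
    return n.bit_length() - 1

def sage_update_ops(n, k):
    """(additions, multiplications) for one SAGE update of one terminal's parameters."""
    nlog = n * log2_int(n)
    return 6 * nlog + 2 * n + k, 3 * nlog + 5 * n + k

def final_stage_ops(n, k):
    """(additions, multiplications) for ICI reconstruction and cancellation, (18)-(20)."""
    nlog = n * log2_int(n)
    return 6 * nlog + 6 * k, 3 * nlog + 28 * k

def total_ops(n, k, m):
    """Overall count: 2*M per-terminal SAGE updates plus the final stage."""
    add_u, mul_u = sage_update_ops(n, k)
    add_f, mul_f = final_stage_ops(n, k)
    return 2 * m * add_u + add_f, 2 * m * mul_u + mul_f

# Simulation setting of Section IV: N = K = 128 subcarriers, M = 2 iterations.
adds, mults = total_ops(n=128, k=128, m=2)
```

Summing the per-update and final-stage counts this way reproduces the closed-form totals, which is a useful consistency check on the individual terms.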

IV. Numerical Results 

In this section, simulation results for the proposed scheme
are presented. For the simulation setup, we consider a TWOR
system with $N = 128$ subcarriers. For simplicity, we allocate
all the subcarriers to each terminal. BPSK modulation is assumed.
A quasi-cyclic LDPC code [13] with codewords of length 1270 and
code rate 1/2 is chosen, and all nodes are assumed to use the
same channel code. The channels between the terminal nodes and
the relay node are modeled as six-tap frequency-selective fading
channels, and the power delay profile of the CIR is
$E|h_i(l)|^2 \propto e^{-l/2}$ for $l = 0, 1, \cdots, L_i - 1$,
with $\sum_{l=0}^{L_i - 1} E|h_i(l)|^2 = 1$.
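A channel realization with the power delay profile described above can be drawn as follows. This is a minimal sketch under assumptions: the profile is read as $E|h_i(l)|^2 \propto e^{-l/2}$ with unit total power, and the taps are taken to be independent zero-mean complex Gaussian (Rayleigh fading), which the paper does not state explicitly. The function names and the seed are illustrative.

```python
import math
import random

def power_delay_profile(num_taps):
    """E|h(l)|^2 proportional to exp(-l/2), normalized to unit total power."""
    raw = [math.exp(-l / 2) for l in range(num_taps)]
    total = sum(raw)
    return [p / total for p in raw]

def draw_cir(num_taps, rng):
    """One CIR realization; complex Gaussian taps are an assumption of this sketch."""
    taps = []
    for p in power_delay_profile(num_taps):
        std = math.sqrt(p / 2)  # per real dimension, so that E|h(l)|^2 = p
        taps.append(complex(rng.gauss(0, std), rng.gauss(0, std)))
    return taps

rng = random.Random(1)
pdp = power_delay_profile(6)  # six-tap profile as in the simulation setup
h1 = draw_cir(6, rng)
h2 = draw_cir(6, rng)
```

Zero-padding `h1` and `h2` to length $N$ gives vectors in the form of the CIR definition used in the system model.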

It is assumed that the uplink frame of each terminal consists
of 10 OFDM blocks. At the beginning of each frame, a preamble
[10] is employed to estimate the CIRs, which are used to
initialize the iteration at the first OFDM block. The initial
CFOs are set to $\hat{\varepsilon}^{[0]} = [0, 0]^T$. The CIRs
are assumed to be constant within one uplink frame, and the final
estimates of $\varepsilon$ and $h$ from the preceding OFDM block
are used to initialize the next block.

Fig. 4: BER performance versus SNR with constant CFOs in
one uplink frame, where $\varepsilon$ is set to
$\varepsilon = [0.1, -0.1]^T$.

[Plot: BER versus SNR (dB), 2 to 16 dB; curves: without CFO
compensation, proposed, mean operation, ideal]

Fig. 5: BER performance versus SNR with time-varying CFOs
in one uplink frame, where $\xi$ is set to 0.1.

In Fig. 2, the BER performance versus the number of SAGE
iterations is presented for different normalized CFOs, with the
SNR set to 10 dB and 15 dB. We set the CFOs as a function of
$\xi$, that is, $\varepsilon = \xi \cdot [-1, 1]^T$, in which
$\xi$ is modeled as a deterministic scalar in the interval
$[0, 0.5]$. The synchronous system with perfect knowledge of the
CIR is also considered to provide a benchmark. As shown in the
figure, two iterations are sufficient for the proposed scheme to
converge. Hence, we fix the maximum number of SAGE iterations $M$
to 2 in the rest of this section.

In Fig. 3, we present the BER performance of the proposed
scheme as a function of the normalized CFO. The normalized CFOs
are again set as $\varepsilon = \xi \cdot [-1, 1]^T$, in which
$\xi$ varies between -0.15 and 0.15. The compensation scheme
proposed in [10] (referred to as the mean operation) and the
synchronous system are also presented for comparison. As shown in
the figure, the BER performance of both the proposed scheme and
the mean operation degrades as the normalized CFO increases.
However, the proposed scheme remarkably outperforms both the mean
operation and the scheme without CFO compensation.

In Fig. 4 and Fig. 5, the BER performance versus SNR for the
proposed scheme is depicted. In Fig. 4, the CFOs are assumed to
be constant during each uplink frame, so the CFO estimation in
(15) is performed only in the first block after the preamble. It
is seen from the figure that the proposed scheme effectively
mitigates the effect of frequency offsets in OFDM-modulated PLNC.
In particular, the SNR loss is approximately 0.5 dB at a BER of
$10^{-3}$ for the case $\varepsilon = [-0.1, 0.1]^T$. In Fig. 5,
we assume that the CFO varies as a sinusoidal function of the
block index $n$ with an amplitude of 5% of the intercarrier
spacing [14], i.e.,
$\varepsilon_i(n) = (-1)^i \xi + 0.05 \sin(§ \pi n)$,
$n = 1, 2, \cdots, 10$. Again, the proposed scheme remarkably
outperforms the mean operation and the scheme without CFO
compensation. The SNR loss is approximately 1.5 dB at a BER of
$10^{-3}$ for $\xi = 0.1$.

In Fig. 6, we compare the BER performance of the proposed
scheme with that of the mean operation for different relative
CFOs. Here we define the relative CFO as
$|\varepsilon_1 - \varepsilon_2|$. Without loss of generality, we
set $\varepsilon_1 = 0.05$ and let $\varepsilon_2$ vary between
-0.15 and 0.15. It is seen from the figure that the BER
performance of the mean operation deteriorates greatly as the
relative CFO increases. In contrast, the proposed scheme
remarkably mitigates the performance degradation over the whole
observation interval.

V. Conclusions 

In this paper, we propose a two-step scheme to cope with
frequency asynchrony in TWOR. In the proposed scheme, the SAGE
algorithm is applied to reconstruct the ICI component from the
received signal at the relay. Then a channel-decoding and
network-coding scheme maps the received samples into the XOR of
the two terminals' information. The simulation results show that
the proposed scheme greatly mitigates the degradation due to CFOs
with relatively low complexity and, compared with the existing
strategy, is robust to larger relative CFOs.